{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## **Demo 0402: Vector Derivatives**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Case 1: Differentiating a Summation**"
   ]
  },
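  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **Example 1**: Let the set $X$ contain the $n$ elements $\\lbrace x_1, x_2, \\cdots, x_n \\rbrace$ and the set $Y$ contain the $n$ elements $\\lbrace y_1, y_2, \\cdots, y_n \\rbrace$. Given the variables $a$ and $b$, define the function:\n",
    "$ f(a,b)=(ax_1+b- y_1 )+(ax_2+b- y_2 )+ \\cdots +(ax_n+b- y_n )= \\sum_{i=1}^{n}(ax_i+b-y_i ) $  \n",
    "Compute $\\dfrac{\\partial f}{\\partial a}$ and $\\dfrac{\\partial f}{\\partial b}$.  \n",
    "**Solution:**\n",
    "When differentiating, keep in mind:\n",
    "* $x_i$ and $y_i$ are constants; we are not differentiating with respect to them.\n",
    "* $a$ and $b$ are the variables of differentiation.\n",
    "* The usual differentiation rules apply term by term (including the chain rule whenever a term is a composite function).  \n",
    "$ \\dfrac{\\partial f}{\\partial a}=(x_1+0-0)+(x_2+0-0)+\\cdots+(x_n+0-0)=\\sum_{i=1}^{n}x_i $  \n",
    "$ \\dfrac{\\partial f}{\\partial b}=(0+1-0)+(0+1-0)+\\cdots+(0+1-0)=\\sum_{i=1}^{n}1=n $"
   ]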
  },
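  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two results above can be checked numerically. The cell below is only a sketch and is not part of the original derivation: it assumes NumPy is installed, uses made-up sample values for $x_i$ and $y_i$, and introduces a hypothetical helper `central_diff` that approximates a derivative by central differences. If the formulas hold, the numerical values should agree with $\\sum_{i=1}^{n}x_i$ and $n$ up to rounding error."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Made-up sample data (an assumption for illustration; any real values work)\n",
    "x = np.array([1.0, 2.0, 3.0, 4.0])\n",
    "y = np.array([2.0, 1.0, 0.5, 3.0])\n",
    "n = len(x)\n",
    "\n",
    "def f(a, b):\n",
    "    # f(a, b) = sum_i (a * x_i + b - y_i)\n",
    "    return np.sum(a * x + b - y)\n",
    "\n",
    "def central_diff(func, t, h=1e-6):\n",
    "    # Hypothetical helper: central-difference approximation of d func / d t\n",
    "    return (func(t + h) - func(t - h)) / (2 * h)\n",
    "\n",
    "a0, b0 = 0.7, -1.3  # arbitrary evaluation point\n",
    "df_da = central_diff(lambda a: f(a, b0), a0)\n",
    "df_db = central_diff(lambda b: f(a0, b), b0)\n",
    "\n",
    "print(df_da, x.sum())  # numerical df/da next to sum(x_i)\n",
    "print(df_db, n)        # numerical df/db next to n"
   ]
  },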
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ">**Example 2**: Given $ f(a,b)=\\frac{1}{2n} \\sum_{i=1}^{n}(b+ax_i-y_i )^2 $  \n",
    "Compute $\\dfrac{\\partial f}{\\partial a}$ and $\\dfrac{\\partial f}{\\partial b}$.  \n",
    "**Solution:**  \n",
    "By the chain rule, each squared term contributes $2(b+ax_i-y_i)$ times the derivative of the inner expression, which is $x_i$ with respect to $a$ and $1$ with respect to $b$:  \n",
    "$ \\dfrac{\\partial f}{\\partial a}=\\dfrac{1}{2n} \\sum_{i=1}^{n}[2 \\cdot (b+ax_i-y_i ) \\cdot \\dfrac{\\partial{(b+ax_i-y_i)}}{\\partial a}]=\\dfrac{1}{n} \\sum_{i=1}^{n}[(b+ax_i-y_i ) \\cdot x_i] $  \n",
    "$ \\dfrac{\\partial f}{\\partial b}=\\dfrac{1}{2n} \\sum_{i=1}^{n}[2 \\cdot (b+ax_i-y_i ) \\cdot \\dfrac{\\partial{(b+ax_i-y_i)}}{\\partial b}]=\\dfrac{1}{n} \\sum_{i=1}^{n}(b+ax_i-y_i) $"
   ]
  },
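  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sanity check on the chain-rule result, the following sketch compares the analytic partial derivatives with central-difference estimates. The sample data, the evaluation point, and the names `grad_f` and `central_diff` are assumptions made for illustration; they do not come from the original text."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Made-up sample data (an assumption for illustration)\n",
    "x = np.array([1.0, 2.0, 3.0, 4.0])\n",
    "y = np.array([2.0, 1.0, 0.5, 3.0])\n",
    "n = len(x)\n",
    "\n",
    "def f(a, b):\n",
    "    # f(a, b) = 1/(2n) * sum_i (b + a * x_i - y_i)^2\n",
    "    return np.sum((b + a * x - y) ** 2) / (2 * n)\n",
    "\n",
    "def grad_f(a, b):\n",
    "    # Analytic partial derivatives from the derivation above\n",
    "    r = b + a * x - y  # inner expression b + a * x_i - y_i\n",
    "    return np.sum(r * x) / n, np.sum(r) / n\n",
    "\n",
    "def central_diff(func, t, h=1e-6):\n",
    "    # Hypothetical helper: central-difference approximation of d func / d t\n",
    "    return (func(t + h) - func(t - h)) / (2 * h)\n",
    "\n",
    "a0, b0 = 0.7, -1.3  # arbitrary evaluation point\n",
    "ga, gb = grad_f(a0, b0)\n",
    "na = central_diff(lambda a: f(a, b0), a0)\n",
    "nb = central_diff(lambda b: f(a0, b), b0)\n",
    "\n",
    "print(ga, na)  # analytic vs numerical df/da\n",
    "print(gb, nb)  # analytic vs numerical df/db"
   ]
  },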
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Case 2: Derivative of a Scalar with Respect to a Vector**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **Example 1:** Given an $m$-dimensional row vector $w$ and an $m$-dimensional column vector $x$, define the function $f(w, x)=wx$  \n",
    "Compute $\\dfrac{\\partial f}{\\partial w}$ and $\\dfrac{\\partial f}{\\partial x}$.  \n",
    "**Solution:**  \n",
    "$f$ is a scalar and $w$ is a vector, so $\\dfrac{\\partial f}{\\partial w}$ can be obtained by taking the partial derivative of $f$ with respect to each component of $w$ and then assembling the results into a vector with the same shape as $w$:  \n",
    "$ \\begin{aligned}\n",
    "\\frac{\\partial f}{\\partial w} & =(\\dfrac{\\partial f}{\\partial w_1} ,\\dfrac{\\partial f}{\\partial w_2},\\cdots,\\dfrac{\\partial f}{\\partial w_m})\n",
    "\\\\ \\\\ &=(\\dfrac{\\partial{(w_1 x_1+w_2 x_2+\\cdots+w_m x_m)}}{\\partial w_1},\\dfrac{\\partial{(w_1 x_1+w_2 x_2+\\cdots+w_m x_m)}}{\\partial w_2},\\cdots,\\dfrac{\\partial{(w_1 x_1+w_2 x_2+\\cdots+w_m x_m)}}{\\partial w_m})\n",
    "\\end{aligned} $  \n",
    "When differentiating with respect to one component of $w$, all other components are treated as constants, so:  \n",
    "$ \\dfrac{\\partial f}{\\partial w}=(x_1,x_2,\\cdots,x_m)=x^T $  \n",
    "Similarly:  \n",
    "$ \\dfrac{\\partial f}{\\partial x}=\\left(\\begin{array}{c}w_1 \\\\ w_2 \\\\ \\vdots \\\\ w_m\\end{array}\\right)=w^T $"
   ]
  },
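  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The component-by-component procedure described above maps directly onto a numerical check. The sketch below assumes NumPy, random sample vectors, and a hypothetical helper `numeric_grad` that perturbs one component at a time and returns an array with the same shape as its argument. It is meant only to illustrate that the gradient with respect to $w$ matches $x^T$ and the gradient with respect to $x$ matches $w^T$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "rng = np.random.default_rng(0)\n",
    "m = 4\n",
    "w = rng.normal(size=(1, m))  # row vector (sample values are an assumption)\n",
    "x = rng.normal(size=(m, 1))  # column vector\n",
    "\n",
    "def f(w, x):\n",
    "    # f(w, x) = w x, returned as a Python scalar\n",
    "    return (w @ x).item()\n",
    "\n",
    "def numeric_grad(func, v, h=1e-6):\n",
    "    # Hypothetical helper: central differences, one component at a time.\n",
    "    # The result has the same shape as v.\n",
    "    g = np.zeros_like(v)\n",
    "    for idx in np.ndindex(v.shape):\n",
    "        step = np.zeros_like(v)\n",
    "        step[idx] = h\n",
    "        g[idx] = (func(v + step) - func(v - step)) / (2 * h)\n",
    "    return g\n",
    "\n",
    "grad_w = numeric_grad(lambda v: f(v, x), w)  # shape (1, m), compare with x^T\n",
    "grad_x = numeric_grad(lambda v: f(w, v), x)  # shape (m, 1), compare with w^T\n",
    "\n",
    "print(np.allclose(grad_w, x.T))\n",
    "print(np.allclose(grad_x, w.T))"
   ]
  },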
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ">**Example 2:** Let $w$ be an $m$-dimensional row vector with scalar elements $\\lbrace w_1, w_2, \\cdots, w_m \\rbrace$, and define the function $f(w)=ww^T$  \n",
    "Compute $\\dfrac{\\partial f}{\\partial w}$.  \n",
    "**Solution:**  \n",
    "$ \\begin{aligned}\n",
    "\\dfrac{\\partial f}{\\partial w} & =(\\frac{\\partial f}{\\partial w_1},\\frac{\\partial f}{\\partial w_2},\\cdots,\\frac{\\partial f}{\\partial w_m}) \\\\ \\\\\n",
    "& =(\\dfrac{\\partial {(w_1^2+w_2^2+\\cdots+w_m^2)}}{\\partial w_1},\\dfrac{\\partial {(w_1^2+w_2^2+\\cdots+w_m^2)}}{\\partial w_2},\\cdots,\\dfrac{\\partial {(w_1^2+w_2^2+\\cdots+w_m^2)}}{\\partial w_m}) \\\\ \\\\\n",
    "& =(\\dfrac{\\partial w_1^2}{\\partial w_1},\\frac{\\partial w_2^2}{\\partial w_2},\\cdots,\\frac{\\partial w_m^2}{\\partial w_m}) \\\\ \\\\\n",
    "& =(2w_1,2w_2,\\cdots,2w_m) \\\\ \\\\\n",
    "& =2w\n",
    "\\end{aligned}$"
   ]
  },
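  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The result $\\dfrac{\\partial f}{\\partial w}=2w$ can be checked the same way. This is a sketch under the same assumptions as before (NumPy, a random sample vector, and a hypothetical central-difference helper `numeric_grad`), not a definitive implementation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "rng = np.random.default_rng(1)\n",
    "m = 5\n",
    "w = rng.normal(size=(1, m))  # row vector (sample values are an assumption)\n",
    "\n",
    "def f(w):\n",
    "    # f(w) = w w^T = w_1^2 + w_2^2 + ... + w_m^2\n",
    "    return (w @ w.T).item()\n",
    "\n",
    "def numeric_grad(func, v, h=1e-6):\n",
    "    # Hypothetical helper: central differences, one component at a time\n",
    "    g = np.zeros_like(v)\n",
    "    for idx in np.ndindex(v.shape):\n",
    "        step = np.zeros_like(v)\n",
    "        step[idx] = h\n",
    "        g[idx] = (func(v + step) - func(v - step)) / (2 * h)\n",
    "    return g\n",
    "\n",
    "grad_w = numeric_grad(f, w)\n",
    "print(grad_w)\n",
    "print(2 * w)  # analytic result for comparison"
   ]
  },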
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Case 3: Derivative of a Vector with Respect to a Vector**"
   ]
  },
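  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ">**Example 1:** Given an $(m,n)$ matrix $W$ and an $n$-dimensional column vector $x$, define the function $f(W, x)=Wx$  \n",
    "Compute $\\dfrac{\\partial f}{\\partial x}$.  \n",
    "**Solution:**  \n",
    "$f=Wx=\\left(\\begin{array}{c}W_{11}x_1+W_{12}x_2+\\cdots+W_{1n}x_n \\\\ W_{21}x_1+W_{22}x_2+\\cdots+W_{2n}x_n \\\\ \\vdots \\\\ W_{m1}x_1+W_{m2}x_2+\\cdots+W_{mn}x_n \\end{array}\\right)$    \n",
    "So $f$ is an $m$-dimensional column vector, while $x$ is an $n$-dimensional column vector. $\\dfrac{\\partial f}{\\partial x}$ can be built up as follows:\n",
    "* Differentiating the first component $f_1$ of $f$ with respect to the column vector $x$ gives $\\dfrac{df_1}{dx}=\\dfrac{d(W_{11}x_1+W_{12}x_2+\\cdots+W_{1n}x_n)}{dx}=(W_{11},W_{12},\\cdots,W_{1n})^T$. Note that the result is a column vector.\n",
    "* Differentiating the $i$-th component $f_i$ of $f$ with respect to the column vector $x$ gives $\\dfrac{df_i}{dx}=\\dfrac{d(W_{i1}x_1+W_{i2}x_2+\\cdots+W_{in}x_n)}{dx}=(W_{i1},W_{i2},\\cdots,W_{in})^T$. Again the result is a column vector.\n",
    "* There are $m$ such column vectors; placing them side by side yields an $(n, m)$ matrix:  \n",
    "$ \\dfrac{\\partial f}{\\partial x}=\n",
    "\\left(\\begin{matrix}\n",
    "W_{11} & W_{21} & \\cdots & W_{m1} \\\\\n",
    "W_{12} & W_{22} & \\cdots & W_{m2} \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "W_{1n} & W_{2n} & \\cdots & W_{mn} \\\\\n",
    "\\end{matrix}\\right)=W^T\n",
    "$  \n",
    "This follows the convention in which each $\\dfrac{df_i}{dx}$ is stored as a column. Under the alternative convention, in which each $\\dfrac{df_i}{dx}$ is stored as a row, the result is the $(m,n)$ matrix $W$ itself."
   ]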
  },
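  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The column-by-column construction above can be reproduced numerically. In the sketch below (assuming NumPy and random sample values; the variable names `cols` and `deriv_text_layout` are hypothetical), `cols[i, j]` approximates $\\dfrac{\\partial f_i}{\\partial x_j}$ by central differences, and its transpose corresponds to the layout used in this example, so it should match $W^T$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "rng = np.random.default_rng(0)\n",
    "m, n = 3, 4\n",
    "W = rng.normal(size=(m, n))  # sample values are an assumption\n",
    "x = rng.normal(size=n)\n",
    "\n",
    "def f(x):\n",
    "    # f(x) = W x, an m-dimensional vector\n",
    "    return W @ x\n",
    "\n",
    "h = 1e-6\n",
    "# Perturb one x_j at a time; column j of cols holds d f / d x_j,\n",
    "# so cols[i, j] approximates d f_i / d x_j and cols has shape (m, n).\n",
    "cols = np.stack(\n",
    "    [(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(n)],\n",
    "    axis=1,\n",
    ")\n",
    "\n",
    "# Layout used in the text: entry (j, i) is d f_i / d x_j, shape (n, m)\n",
    "deriv_text_layout = cols.T\n",
    "\n",
    "print(deriv_text_layout.shape)\n",
    "print(np.allclose(deriv_text_layout, W.T))"
   ]
  },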
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Case 4: Derivative of a Vector with Respect to a Matrix**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ">**Example 1:** Given an $(m,n)$ matrix $W$ and an $n$-dimensional column vector $x$, define the function $f(W, x)=Wx$  \n",
    "Compute $\\dfrac{\\partial f}{\\partial W}$.  \n",
    "**Solution:**\n",
    "* Consider $\\dfrac{\\partial f}{\\partial W}$: $f$ is a column vector with $m$ elements, so $\\dfrac{\\partial f}{\\partial W}$ can first be decomposed into the partial derivative of each element of $f$ with respect to $W$:  \n",
    "$\\dfrac{\\partial f}{\\partial W}=\\left(\\begin{array}{c}\\dfrac{\\partial f_1}{\\partial W} \\\\ \\dfrac{\\partial f_2}{\\partial W} \\\\ \\vdots \\\\ \\dfrac{\\partial f_m}{\\partial W}\\end{array}\\right) $\n",
    "* Consider $\\dfrac{\\partial f_i}{\\partial W}$: $W$ is an $(m,n)$ matrix and $f_i$ is a scalar, so $\\dfrac{\\partial f_i}{\\partial W}$ can be decomposed into the partial derivative of $f_i$ with respect to each element of $W$:  \n",
    "$ \\dfrac{\\partial f_i}{\\partial W}=\n",
    "\\left(\\begin{matrix}\n",
    "\\frac{\\partial f_i}{\\partial W_{11}} & \\frac{\\partial f_i}{\\partial W_{12}} & \\cdots & \\frac{\\partial f_i}{\\partial W_{1n}} \\\\\n",
    "\\frac{\\partial f_i}{\\partial W_{21}} & \\frac{\\partial f_i}{\\partial W_{22}} & \\cdots & \\frac{\\partial f_i}{\\partial W_{2n}} \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "\\frac{\\partial f_i}{\\partial W_{m1}} & \\frac{\\partial f_i}{\\partial W_{m2}} & \\cdots & \\frac{\\partial f_i}{\\partial W_{mn}} \\\\\n",
    "\\end{matrix}\\right)\n",
    "$\n",
    "* Consider $\\dfrac{\\partial f_i}{\\partial W_{kj}}$: only the terms of $f_i$ that involve $W_{kj}$ matter, because every other term has zero derivative with respect to $W_{kj}$. Since $f=Wx$, the value of $f_i=W_{i1}x_1+W_{i2}x_2+\\cdots+W_{in}x_n$ depends only on row $i$ of $W$, so:\n",
    " * when $i \\neq k$, $ \\dfrac{\\partial f_i}{\\partial W_{kj}} = 0 $\n",
    " * when $i = k$, $ \\dfrac{\\partial f_i}{\\partial W_{kj}} = x_j $  \n",
    "Note that $\\dfrac{\\partial f_i}{\\partial W_{kj}}$ and $x_j$ are both scalars.  \n",
    "* Putting this together, with $W_k$ denoting the $k$-th row of $W$:  \n",
    " * when $i \\neq k$, $ \\dfrac{\\partial f_i}{\\partial W_{k}} = 0 $ (a zero row vector)\n",
    " * when $i = k$, $ \\dfrac{\\partial f_i}{\\partial W_k}=x^T $  \n",
    "That is, only row $i$ is nonzero:  \n",
    "$ \\dfrac{\\partial f_i}{\\partial W}=\n",
    "\\left(\\begin{matrix}\n",
    "0 & 0 & \\cdots & 0 \\\\\n",
    "0 & 0 & \\cdots & 0 \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "x_1 & x_2 & \\cdots & x_n \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "0 & 0 & \\cdots & 0 \\\\\n",
    "\\end{matrix}\\right)\n",
    "$\n",
    "* Final result: the derivative of each component $f_i$ of $f$ with respect to $W$ is an $(m,n)$ matrix whose $i$-th row is $x^T$ and whose other entries are all 0.  \n",
    "The derivation above shows that $\\dfrac{\\partial f}{\\partial W}$ is in effect an $m$-dimensional vector whose elements are each an $(m,n)$ matrix, so $\\dfrac{\\partial f}{\\partial W}$ has $m \\times m \\times n$ elements in total."
   ]
  }
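  ,
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The structure of $\\dfrac{\\partial f}{\\partial W}$ derived above can be illustrated with a small array. The sketch below assumes NumPy and random sample values, and stores the result in a hypothetical array `T` of shape $(m, m, n)$ with `T[i, k, j]` standing for $\\dfrac{\\partial f_i}{\\partial W_{kj}}$. It builds `T` from the analytic rule and compares it with central-difference estimates in `T_num`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "rng = np.random.default_rng(0)\n",
    "m, n = 3, 4\n",
    "W = rng.normal(size=(m, n))  # sample values are an assumption\n",
    "x = rng.normal(size=n)\n",
    "\n",
    "# Analytic rule from the derivation: T[i, k, j] = x_j if i == k, else 0\n",
    "T = np.einsum('ik,j->ikj', np.eye(m), x)\n",
    "\n",
    "# Numerical check: perturb each entry W_kj and record the change in f = W x\n",
    "h = 1e-6\n",
    "T_num = np.zeros((m, m, n))\n",
    "for k in range(m):\n",
    "    for j in range(n):\n",
    "        step = np.zeros((m, n))\n",
    "        step[k, j] = h\n",
    "        T_num[:, k, j] = ((W + step) @ x - (W - step) @ x) / (2 * h)\n",
    "\n",
    "print(T.shape, T.size)       # m matrices of shape (m, n)\n",
    "print(np.allclose(T, T_num))\n",
    "print(T[1])                  # only the row with index 1 is nonzero"
   ]
  }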
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
